The aim of this practical is to enhance your understanding of multiple imputation in general. You will learn how to pool the results of analyses performed on multiply imputed data, how to approach different types of data, and how to avoid the pitfalls researchers may fall into. The main objective is to increase your knowledge and understanding of applications of multiple imputation.

Again, we start by loading the necessary packages (with library()) and fixing the random seed so that our outcomes are replicable.

library(mice)
library(lme4)
library(mitml)
set.seed(123)

All the best,

Gerko and Stef


The popularity data set

We are going to work with the popularity data from Joop Hox (2010). The variables in this data set are described as follows:

pupil \(\quad\) Pupil number within class
class \(\quad\) Class number
extrav \(\quad\) Pupil extraversion
sex \(\quad\) Pupil gender
texp \(\quad\) Teacher experience (years)
popular \(\quad\) Popularity as rated by the pupil’s peers
popteach \(\quad\) Popularity as rated by the teacher

Obtaining the popularity data

A workspace with complete and incomplete versions of the popularity data can be obtained here or can be loaded into the Global Environment by running:

con <- url("https://www.gerkovink.com/mimp/popular.RData")
load(con)

This workspace contains several datasets and functions that, once loaded, are available to you in R. If you’d like to see what is in your Global Environment after importing the workspace, run the following code:

ls()
## [1] "con"                      "icc"                     
## [3] "mice.impute.2lonly.mean2" "popMCAR"                 
## [5] "popMCAR2"                 "popNCR"                  
## [7] "popNCR2"                  "popNCR3"                 
## [9] "popular"

The popular data set is the complete version. The other popXXX sets are incomplete versions of this data set with varying levels of complexity. The icc() function is adapted from function ICC1() in package multilevel.
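For intuition, an ICC1-style helper along these lines can be computed from a one-way ANOVA fit. The sketch below is a hypothetical reimplementation (icc_sketch is not the function shipped in popular.RData, which may differ in details such as the handling of unequal group sizes):

```r
# Sketch of an ICC1-style helper computed from a one-way ANOVA fit
# (hypothetical reimplementation; the icc() in popular.RData may differ).
icc_sketch <- function(fit) {
  tab <- summary(fit)[[1]]                              # ANOVA table
  msb <- tab[["Mean Sq"]][1]                            # between-group mean square
  msw <- tab[["Mean Sq"]][2]                            # within-group mean square
  k   <- (sum(tab[["Df"]]) + 1) / (tab[["Df"]][1] + 1)  # average group size
  (msb - msw) / (msb + (k - 1) * msw)
}
# icc_sketch(aov(popular ~ as.factor(class), data = popNCR)) should then be
# close to the value returned by icc() later in this practical.
```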

Having the complete version of the data (the popular set) is a luxury that we do not have in practice. However, for educational purposes we can use this simulated set and its incomplete versions to inspect, evaluate and verify our imputation methodology and approach.


The dataset popNCR is a variation on the Hox (2010) data, where the missingness in the variables is either missing at random (MAR) or missing not at random (MNAR). We will be working with this popNCR set in this practical.


Inspecting the popNCR data


1. Check with the functions head(), dim() - alternatively one could use nrow() and ncol() instead of dim() - and summary() how large the dataset is, which variables the data frame consists of, and whether any variables contain missing values.

head(popNCR)
##   pupil class extrav  sex texp popular popteach
## 1     1     1      5    1   NA     6.3       NA
## 2     2     1     NA    0   24     4.9       NA
## 3     3     1      4    1   NA     5.3        6
## 4     4     1      3 <NA>   NA     4.7        5
## 5     5     1      5    1   24      NA        6
## 6     6     1     NA    0   NA     4.7        5
dim(popNCR)
## [1] 2000    7
nrow(popNCR)
## [1] 2000
ncol(popNCR)
## [1] 7
summary(popNCR)
##      pupil           class          extrav         sex           texp     
##  Min.   : 1.00   17     :  26   Min.   : 1.000   0   :661   Min.   : 2.0  
##  1st Qu.: 6.00   63     :  25   1st Qu.: 4.000   1   :843   1st Qu.: 7.0  
##  Median :11.00   10     :  24   Median : 5.000   NA's:496   Median :12.0  
##  Mean   :10.65   15     :  24   Mean   : 5.313              Mean   :11.8  
##  3rd Qu.:16.00   4      :  23   3rd Qu.: 6.000              3rd Qu.:16.0  
##  Max.   :26.00   21     :  23   Max.   :10.000              Max.   :25.0  
##                  (Other):1855   NA's   :516                 NA's   :976   
##     popular         popteach     
##  Min.   :0.000   Min.   : 1.000  
##  1st Qu.:3.900   1st Qu.: 4.000  
##  Median :4.800   Median : 5.000  
##  Mean   :4.829   Mean   : 4.834  
##  3rd Qu.:5.800   3rd Qu.: 6.000  
##  Max.   :9.100   Max.   :10.000  
##  NA's   :510     NA's   :528

The data set has 2000 rows and 7 columns (variables). The variables extrav, sex, texp, popular and popteach contain missing values. Roughly a quarter of the values in each of these variables is missing, except for texp, where about 50% is missing.


2. As we have seen before, the function md.pattern() is used to display all different missing data patterns. How many different missing data patterns are present in the popNCR dataframe and which patterns occur most frequently in the data?

md.pattern(popNCR, plot = FALSE)
##     pupil class sex popular extrav popteach texp     
## 308     1     1   1       1      1        1    1    0
## 279     1     1   1       1      1        1    0    1
## 110     1     1   1       1      1        0    1    1
## 115     1     1   1       1      1        0    0    2
## 114     1     1   1       1      0        1    1    1
## 98      1     1   1       1      0        1    0    2
## 33      1     1   1       1      0        0    1    2
## 24      1     1   1       1      0        0    0    3
## 119     1     1   1       0      1        1    1    1
## 113     1     1   1       0      1        1    0    2
## 50      1     1   1       0      1        0    1    2
## 75      1     1   1       0      1        0    0    3
## 29      1     1   1       0      0        1    1    2
## 21      1     1   1       0      0        1    0    3
## 2       1     1   1       0      0        0    1    3
## 14      1     1   1       0      0        0    0    4
## 102     1     1   0       1      1        1    1    1
## 89      1     1   0       1      1        1    0    2
## 25      1     1   0       1      1        0    1    2
## 29      1     1   0       1      1        0    0    3
## 85      1     1   0       1      0        1    1    2
## 56      1     1   0       1      0        1    0    3
## 9       1     1   0       1      0        0    1    3
## 14      1     1   0       1      0        0    0    4
## 19      1     1   0       0      1        1    1    2
## 27      1     1   0       0      1        1    0    3
## 13      1     1   0       0      1        0    1    3
## 11      1     1   0       0      1        0    0    4
## 4       1     1   0       0      0        1    1    3
## 9       1     1   0       0      0        1    0    4
## 2       1     1   0       0      0        0    1    4
## 2       1     1   0       0      0        0    0    5
##         0     0 496     510    516      528  976 3026
md.pattern(popNCR)

There are 32 unique patterns. The pattern where everything is observed and the pattern where only texp is missing occur most frequently.
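The same counts can be obtained without md.pattern(), using only base R. The sketch below (count_patterns is a hypothetical helper) maps each row to a string of missingness flags and tabulates those strings:

```r
# Count missing-data patterns by hand (hypothetical helper, base R only):
# each row becomes a string of TRUE/FALSE missingness flags.
count_patterns <- function(data) {
  pattern_id <- apply(is.na(data), 1, paste, collapse = "")
  sort(table(pattern_id), decreasing = TRUE)
}
# length(count_patterns(popNCR)) gives 32, matching md.pattern() above,
# and the first entries of the table are the most frequent patterns.
```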

If we omit texp, then half the patterns disappear:

md.pattern(popNCR[, -5], plot = FALSE)
##     pupil class sex popular extrav popteach     
## 587     1     1   1       1      1        1    0
## 225     1     1   1       1      1        0    1
## 212     1     1   1       1      0        1    1
## 57      1     1   1       1      0        0    2
## 232     1     1   1       0      1        1    1
## 125     1     1   1       0      1        0    2
## 50      1     1   1       0      0        1    2
## 16      1     1   1       0      0        0    3
## 191     1     1   0       1      1        1    1
## 54      1     1   0       1      1        0    2
## 141     1     1   0       1      0        1    2
## 23      1     1   0       1      0        0    3
## 46      1     1   0       0      1        1    2
## 24      1     1   0       0      1        0    3
## 13      1     1   0       0      0        1    3
## 4       1     1   0       0      0        0    4
##         0     0 496     510    516      528 2050
md.pattern(popNCR[, -5])

Without the fifth column [, -5] which is texp, there are only 16 patterns.


Where does the missingness come from?


Missingness inspection of the other variables.


4. Does the missingness of the other incomplete variables depend on popteach? If yes, what is the direction of the relation?

histogram(~ popteach | is.na(sex), data = popNCR)  

There seems to be a left-tailed relation between popteach and the missingness in sex.

histogram(~ popteach | is.na(extrav), data = popNCR)

There also seems to be a left-tailed relation between popteach and the missingness in extrav.

histogram(~ popteach | is.na(texp), data = popNCR)

There seems to be no observable relation between popteach and the missingness in texp. It might be random or even MNAR.
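The impressions from the histograms can be backed up numerically. The sketch below (mean_by_missing is a hypothetical helper) compares the mean of one variable between rows where another variable is observed and where it is missing; a clear difference points to a MAR-type dependency:

```r
# Mean of x, split by whether another variable is missing
# (hypothetical helper; a clear difference between the two group
# means points to a MAR-type dependency).
mean_by_missing <- function(x, miss_in) {
  tapply(x, is.na(miss_in), mean, na.rm = TRUE)
}
# e.g. mean_by_missing(popNCR$popteach, popNCR$sex) contrasts the average
# teacher-rated popularity between rows with observed and missing sex.
```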


5. Find out if the missingness in teacher popularity depends on pupil popularity.

histogram(~ popular | is.na(popteach), data = popNCR)

Yes: there is a dependency. The relation seems to be right-tailed.


We now know that some variables (columns) seem to be related to the missingness in other variables. This is a clear indication that these variables should be included as predictors in the imputation models for the incomplete variables.


6. Have a look at the intraclass correlation (ICC) for the incomplete variables popular, popteach and texp.

icc(aov(popular ~ as.factor(class), data = popNCR))
## [1] 0.328007
icc(aov(popteach ~ class, data = popNCR))
## [1] 0.3138658
icc(aov(texp ~ class, data = popNCR))
## [1] 1

Please note that the icc() function is adapted from function ICC1() in package multilevel, but is included in the workspace popular.RData. Make a note of the ICCs; you’ll need them later in Practical 3.


7. Do you think it is necessary to take the multilevel structure into account?

YES! There is a strong cluster structure. If we ignore the clustering in our imputation model, we may obtain invalid inferences. To stay as close as possible to the true data-generating model, we must take the cluster structure into account during imputation.


Multiple Imputation of the popNCR data set


8. Impute the popNCR dataset with mice using imputation method norm for popular, popteach, texp and extrav. Exclude class as a predictor for all variables. Name the multiply imputed data set (mids) resulting from the mice call imp1. Use \(m=10\) imputations and \(maxit = 15\) iterations.

#create adjusted imputation method vector
meth <- make.method(popNCR)
meth[c("popular", "popteach", "texp", "extrav")] <- "norm"
meth
##    pupil    class   extrav      sex     texp  popular popteach 
##       ""       ""   "norm" "logreg"   "norm"   "norm"   "norm"
#create adjusted predictorMatrix
pred <- make.predictorMatrix(popNCR)
pred[, "class"] <- 0
pred[, "pupil"] <- 0
pred
##          pupil class extrav sex texp popular popteach
## pupil        0     0      1   1    1       1        1
## class        0     0      1   1    1       1        1
## extrav       0     0      0   1    1       1        1
## sex          0     0      1   0    1       1        1
## texp         0     0      1   1    0       1        1
## popular      0     0      1   1    1       0        1
## popteach     0     0      1   1    1       1        0
#generate imputations
imp1 <- mice(popNCR, m = 10, maxit = 15, 
             method = meth, 
             predictorMatrix = pred, 
             print = FALSE)

Inspecting convergence


9. Inspect convergence of the mice algorithm by studying the traceplots

plot(imp1)

Convergence has been reached for some variables, but is not convincing for all. However, we’ve not done any justice to the structure of the data - we’ve imputed the data as a flat file without paying any attention to the clustering in the data. The imputation model can be greatly improved, which we will see in the next practical.
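The traceplots are drawn from the chain means and variances that mice stores in the mids object; these can also be inspected directly, for example:

```r
# Chain means for texp: an (iterations x imputations) slice of the array
# that plot() visualises; imp1$chainVar holds the corresponding variances.
imp1$chainMean["texp", , ]
```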


Pooling the analysis


9. Run the following model on the imputed data and on the incomplete data, and compare both with the model fitted on the ‘true’ data

  • For the true data:
library(lme4)
fit.true <- with(popular, lmer(popular ~ 1 + (1|class)))
summary(fit.true)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ 1 + (1 | class)
## 
## REML criterion at convergence: 6330.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5655 -0.6975  0.0020  0.6758  3.3175 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.7021   0.8379  
##  Residual             1.2218   1.1053  
## Number of obs: 2000, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  5.07786    0.08739    58.1
  • For the incomplete data:
fit.incomplete <- with(popNCR, lmer(popular ~ 1 + (1|class)))
summary(fit.incomplete)
## Linear mixed model fit by REML ['lmerMod']
## Formula: popular ~ 1 + (1 | class)
## 
## REML criterion at convergence: 4727.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2577 -0.6985 -0.0257  0.6649  3.2828 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  class    (Intercept) 0.6298   0.7936  
##  Residual             1.2078   1.0990  
## Number of obs: 1490, groups:  class, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  4.91988    0.08452   58.21
  • For the imputed data:
fit <- with(imp1, lmer(popular ~ 1 + (1|class)))
pool <- pool(fit)
summary(pool)
##             estimate std.error statistic       df p.value
## (Intercept) 5.013788 0.0772285  64.92147 1731.811       0

We may obtain the variance components with the testEstimates() function from package mitml:

library(mitml)
testEstimates(as.mitml.result(fit), var.comp = TRUE)$var.comp
##                             Estimate
## Intercept~~Intercept|class 0.5153594
## Residual~~Residual         1.3231304
## ICC|class                  0.2803254

There are quite some differences between the models. With multiple imputation we are able to correct the inference to quite some extent (e.g. for the fixed effects), but we miss out on correcting the inference for the random effects. Then again, we completely ignored the clustering structure in our data.
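For reference, the pooling step performed by pool() follows Rubin's rules, which are simple enough to sketch by hand (pool_rubin is a hypothetical helper; mice::pool() additionally computes degrees of freedom):

```r
# Rubin's rules (sketch): pool m complete-data estimates and standard errors.
pool_rubin <- function(est, se) {
  m    <- length(est)
  qbar <- mean(est)            # pooled point estimate
  w    <- mean(se^2)           # average within-imputation variance
  b    <- var(est)             # between-imputation variance
  tot  <- w + (1 + 1 / m) * b  # total variance
  c(estimate = qbar, std.error = sqrt(tot))
}
# Applying this to the 10 intercept estimates in fit$analyses would
# reproduce the pooled estimate and standard error shown above
# (up to degrees-of-freedom details).
```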


Investigating the imputations


Simple comparison of column means


10. Compare the means of the variables in the first imputed dataset and in the incomplete dataset.

summary(complete(imp1))
##      pupil           class          extrav        sex     
##  Min.   : 1.00   17     :  26   Min.   : 0.3408   0: 973  
##  1st Qu.: 6.00   63     :  25   1st Qu.: 4.0000   1:1027  
##  Median :11.00   10     :  24   Median : 5.0000           
##  Mean   :10.65   15     :  24   Mean   : 5.2210           
##  3rd Qu.:16.00   4      :  23   3rd Qu.: 6.0000           
##  Max.   :26.00   21     :  23   Max.   :10.0000           
##                  (Other):1855                             
##       texp           popular         popteach     
##  Min.   :-7.887   Min.   :0.000   Min.   : 1.000  
##  1st Qu.: 8.000   1st Qu.:4.044   1st Qu.: 4.000  
##  Median :12.976   Median :5.000   Median : 5.000  
##  Mean   :12.612   Mean   :4.986   Mean   : 5.011  
##  3rd Qu.:17.000   3rd Qu.:5.900   3rd Qu.: 6.000  
##  Max.   :33.472   Max.   :9.100   Max.   :10.000  
## 
summary(popNCR)
##      pupil           class          extrav         sex           texp     
##  Min.   : 1.00   17     :  26   Min.   : 1.000   0   :661   Min.   : 2.0  
##  1st Qu.: 6.00   63     :  25   1st Qu.: 4.000   1   :843   1st Qu.: 7.0  
##  Median :11.00   10     :  24   Median : 5.000   NA's:496   Median :12.0  
##  Mean   :10.65   15     :  24   Mean   : 5.313              Mean   :11.8  
##  3rd Qu.:16.00   4      :  23   3rd Qu.: 6.000              3rd Qu.:16.0  
##  Max.   :26.00   21     :  23   Max.   :10.000              Max.   :25.0  
##                  (Other):1855   NA's   :516                 NA's   :976   
##     popular         popteach     
##  Min.   :0.000   Min.   : 1.000  
##  1st Qu.:3.900   1st Qu.: 4.000  
##  Median :4.800   Median : 5.000  
##  Mean   :4.829   Mean   : 4.834  
##  3rd Qu.:5.800   3rd Qu.: 6.000  
##  Max.   :9.100   Max.   :10.000  
##  NA's   :510     NA's   :528

11. The missingness in texp is MNAR: higher values for texp have a larger probability of being missing. Can you see this in the imputed data? Do you think this is a problem?

Yes, we can see this in the imputed data: teacher experience increases slightly after imputation. Note, however, that texp is the same for all pupils in a class, but not all pupils have this information recorded (as if some pupils did not remember, or were not present during data collection). This is not a problem: as long as at least one pupil in each class has teacher experience recorded, we can deductively impute the correct (i.e. true) value for every pupil in that class.
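Such deductive imputation can be sketched in a few lines (fill_constant is a hypothetical helper, not part of mice):

```r
# Fill in a variable that is constant within groups from its observed
# values (hypothetical helper; works whenever at least one group member
# has the value recorded).
fill_constant <- function(x, group) {
  ave(x, group, FUN = function(v) {
    obs <- unique(na.omit(v))
    if (length(obs) == 1) replace(v, is.na(v), obs) else v
  })
}
# fill_constant(popNCR$texp, popNCR$class) would restore the true teacher
# experience for every pupil in classes with at least one observed value.
```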


Comparing the intraclass correlations

12. Compare the ICC’s of the variables in the first imputed dataset with those in the incomplete dataset (use popular, popteach and texp). Make a note of the ICC’s after imputation.

data.frame(vars = names(popNCR[c(6, 7, 5)]), 
           observed = c(icc(aov(popular ~ class, popNCR)), 
                        icc(aov(popteach ~ class, popNCR)), 
                        icc(aov(texp ~ class, popNCR))), 
           norm     = c(icc(aov(popular ~ class, complete(imp1))), 
                        icc(aov(popteach ~ class, complete(imp1))), 
                        icc(aov(texp ~ class, complete(imp1)))))
##       vars  observed      norm
## 1  popular 0.3280070 0.2785369
## 2 popteach 0.3138658 0.2542062
## 3     texp 1.0000000 0.4324512

Incorporating the fixed effects


13. Now impute the popNCR dataset again with mice using imputation method norm for popular, popteach, texp and extrav, but now include class as a predictor for all variables. Call the mids-object imp2.

pred <- make.predictorMatrix(popNCR)
pred[, "pupil"] <- 0
imp2 <- mice(popNCR, meth = meth, pred = pred, print = FALSE)
## Warning: Number of logged events: 90

We exclude pupil here to avoid overfitting our data with the pupil identifier. Including pupil would result in a model with zero residual variance.


Comparing the ICC’s


14. Compare the ICC’s of the variables in the first imputed dataset from imp2 with those of imp1 and the incomplete dataset (use popular, popteach and texp). Make a note of the ICC’s after imputation.

data.frame(vars = names(popNCR[c(6, 7, 5)]), 
           observed  = c(icc(aov(popular ~ class, popNCR)), 
                         icc(aov(popteach ~ class, popNCR)), 
                         icc(aov(texp ~ class, popNCR))), 
           norm      = c(icc(aov(popular ~ class, complete(imp1))), 
                         icc(aov(popteach ~ class, complete(imp1))), 
                         icc(aov(texp ~ class, complete(imp1)))), 
           normclass = c(icc(aov(popular ~ class, complete(imp2))), 
                         icc(aov(popteach ~ class, complete(imp2))), 
                         icc(aov(texp ~ class, complete(imp2)))))
##       vars  observed      norm normclass
## 1  popular 0.3280070 0.2785369 0.3670845
## 2 popteach 0.3138658 0.2542062 0.3347250
## 3     texp 1.0000000 0.4324512 1.0000000

By simply forcing the algorithm to use the class variable during estimation we adopt a fixed-effects approach. This is equivalent to formulating separate regression models for each class and imputing within classes from these models.


Checking Convergence of the imputations

15. Inspect the trace lines for the variables popular, texp and extrav.

plot(imp2, c("popular", "texp", "popteach"))

This does not look quite OK yet. For example, there are clear trends for popteach, and for texp some of the streams do not even intermingle between iterations 4 and 5. Adding another 20 iterations gives a more convincing graph:

imp2b <- mice.mids(imp2, maxit = 20, print = FALSE)
plot(imp2b, c("popular", "texp", "popteach"))

We can use function mice.mids() to continue from the state where the object imp2 stopped. This is fortunate, as we do not have to re-run the first maxit = 5 default iterations.


Visually inspecting the completed data

16. Plot the densities of the observed and imputed data (use imp2) with the function densityplot().

To obtain all densities of the different imputed datasets use

densityplot(imp2)
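Besides densityplot(), mice offers further lattice-style diagnostics for comparing observed and imputed values, for example:

```r
# Observed versus imputed values of popular, shown per imputation
stripplot(imp2, popular ~ .imp)
# Box-and-whisker version of the same comparison
bwplot(imp2, popular ~ .imp)
```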


Intermezzo


It is always wise to ask yourself the following questions when attacking a multilevel missing data problem - see Table 7.1 in Van Buuren (2018):


Q1. Will the complete-data model include random slopes?: Thus far we have not formulated any complicated analysis model, but if our analysis model were to include random slopes, then we should definitely account for those terms in our imputation model. For now, let’s assume that our analysis model remains: lme4::lmer(popular ~ (1 | class))


Q2. Will the data contain systematically missing values? With systematically missing data, a variable has no observed values at all within a cluster.

We can verify this by counting the complete cases per cluster. Note that aggregate() drops rows with any missing value by default (na.action = na.omit), so the counts below are the numbers of complete cases within each class:

aggregate(. ~ class, data = popNCR, FUN = length)
##    class pupil extrav sex texp popular popteach
## 1      1     4      4   4    4       4        4
## 2      2     4      4   4    4       4        4
## 3      3     4      4   4    4       4        4
## 4      4     1      1   1    1       1        1
## 5      5     7      7   7    7       7        7
## 6      6     5      5   5    5       5        5
## 7      8     5      5   5    5       5        5
## 8      9     5      5   5    5       5        5
## 9     11     2      2   2    2       2        2
## 10    12     3      3   3    3       3        3
## 11    13     2      2   2    2       2        2
## 12    14     3      3   3    3       3        3
## 13    15     4      4   4    4       4        4
## 14    16     3      3   3    3       3        3
## 15    17     3      3   3    3       3        3
## 16    18     3      3   3    3       3        3
## 17    19     4      4   4    4       4        4
## 18    20     2      2   2    2       2        2
## 19    21     3      3   3    3       3        3
## 20    22     4      4   4    4       4        4
## 21    23     2      2   2    2       2        2
## 22    24     6      6   6    6       6        6
## 23    25     1      1   1    1       1        1
## 24    26     3      3   3    3       3        3
## 25    27     2      2   2    2       2        2
## 26    28     5      5   5    5       5        5
## 27    30     3      3   3    3       3        3
## 28    31     1      1   1    1       1        1
## 29    32     7      7   7    7       7        7
## 30    33     2      2   2    2       2        2
## 31    34     6      6   6    6       6        6
## 32    35     3      3   3    3       3        3
## 33    36     4      4   4    4       4        4
## 34    37     1      1   1    1       1        1
## 35    38     4      4   4    4       4        4
## 36    39     2      2   2    2       2        2
## 37    40     3      3   3    3       3        3
## 38    41     1      1   1    1       1        1
## 39    42     6      6   6    6       6        6
## 40    43     5      5   5    5       5        5
## 41    44     3      3   3    3       3        3
## 42    45     8      8   8    8       8        8
## 43    46     2      2   2    2       2        2
## 44    47     1      1   1    1       1        1
## 45    48     3      3   3    3       3        3
## 46    49     4      4   4    4       4        4
## 47    50     1      1   1    1       1        1
## 48    51     2      2   2    2       2        2
## 49    52     7      7   7    7       7        7
## 50    53     1      1   1    1       1        1
## 51    54     4      4   4    4       4        4
## 52    55     4      4   4    4       4        4
## 53    56     6      6   6    6       6        6
## 54    57     3      3   3    3       3        3
## 55    58     2      2   2    2       2        2
## 56    59     7      7   7    7       7        7
## 57    60     7      7   7    7       7        7
## 58    61     1      1   1    1       1        1
## 59    62     2      2   2    2       2        2
## 60    63     5      5   5    5       5        5
## 61    65     1      1   1    1       1        1
## 62    66     1      1   1    1       1        1
## 63    67     1      1   1    1       1        1
## 64    68     3      3   3    3       3        3
## 65    69     2      2   2    2       2        2
## 66    70     1      1   1    1       1        1
## 67    71     2      2   2    2       2        2
## 68    72     1      1   1    1       1        1
## 69    73     8      8   8    8       8        8
## 70    74     2      2   2    2       2        2
## 71    75     1      1   1    1       1        1
## 72    76     2      2   2    2       2        2
## 73    77     5      5   5    5       5        5
## 74    78     2      2   2    2       2        2
## 75    79     1      1   1    1       1        1
## 76    80     2      2   2    2       2        2
## 77    81     6      6   6    6       6        6
## 78    82     6      6   6    6       6        6
## 79    84     1      1   1    1       1        1
## 80    85     1      1   1    1       1        1
## 81    86     3      3   3    3       3        3
## 82    87     9      9   9    9       9        9
## 83    88     4      4   4    4       4        4
## 84    89     1      1   1    1       1        1
## 85    90     6      6   6    6       6        6
## 86    91     3      3   3    3       3        3
## 87    92     3      3   3    3       3        3
## 88    93     3      3   3    3       3        3
## 89    94     2      2   2    2       2        2
## 90    95     3      3   3    3       3        3
## 91    96     4      4   4    4       4        4
## 92    97     2      2   2    2       2        2
## 93    99     1      1   1    1       1        1
## 94   100     4      4   4    4       4        4

None of the classes is completely unobserved, although some classes do not appear in the output above because they contain no complete cases.
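A more direct check on systematic missingness counts the observed values per variable within each cluster (a sketch; observed_per_cluster is a hypothetical helper):

```r
# Observed values per variable per cluster; a zero would flag a variable
# that is entirely unobserved (systematically missing) in that cluster.
observed_per_cluster <- function(data, group) {
  rowsum(1 * !is.na(data.matrix(data)), group)
}
# any(observed_per_cluster(popNCR[, -(1:2)], popNCR$class) == 0) then
# answers Q2 directly.
```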


Q3. Will the distribution of the residuals be non-normal?

fit <- with(popNCR, lmer(popular ~ (1 | class)))
res <- residuals(fit)
hist(res, main = "Residuals")

The residuals seem to be quite normally distributed.


Q4. Will the error variance differ over clusters?

dotplot(ranef(fit, condVar = TRUE))
## $class

The error variance seems quite constant from a quick visual inspection.


Q5. Will there be small clusters?

aggregate(pupil ~ class, data = popNCR, length)
##     class pupil
## 1       1    20
## 2       2    20
## 3       3    18
## 4       4    23
## 5       5    21
## 6       6    20
## 7       7    21
## 8       8    20
## 9       9    20
## 10     10    24
## 11     11    22
## 12     12    17
## 13     13    20
## 14     14    17
## 15     15    24
## 16     16    17
## 17     17    26
## 18     18    21
## 19     19    20
## 20     20    20
## 21     21    23
## 22     22    20
## 23     23    22
## 24     24    19
## 25     25    20
## 26     26    21
## 27     27    20
## 28     28    20
## 29     29    17
## 30     30    17
## 31     31    19
## 32     32    23
## 33     33    19
## 34     34    22
## 35     35    20
## 36     36    19
## 37     37    19
## 38     38    17
## 39     39    16
## 40     40    20
## 41     41    19
## 42     42    21
## 43     43    17
## 44     44    19
## 45     45    19
## 46     46    23
## 47     47    17
## 48     48    19
## 49     49    20
## 50     50    20
## 51     51    23
## 52     52    23
## 53     53    19
## 54     54    19
## 55     55    21
## 56     56    19
## 57     57    18
## 58     58    20
## 59     59    17
## 60     60    21
## 61     61    19
## 62     62    20
## 63     63    25
## 64     64    19
## 65     65    22
## 66     66    20
## 67     67    21
## 68     68    17
## 69     69    18
## 70     70    21
## 71     71    22
## 72     72    21
## 73     73    20
## 74     74    17
## 75     75    17
## 76     76    18
## 77     77    19
## 78     78    18
## 79     79    20
## 80     80    20
## 81     81    21
## 82     82    21
## 83     83    18
## 84     84    21
## 85     85    22
## 86     86    22
## 87     87    21
## 88     88    23
## 89     89    18
## 90     90    23
## 91     91    18
## 92     92    17
## 93     93    20
## 94     94    16
## 95     95    21
## 96     96    21
## 97     97    21
## 98     98    21
## 99     99    23
## 100   100    20

We can see that the average cluster size is approximately 20 pupils, with no cluster having fewer than 16 pupils.
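This check condenses to one line (a sketch; cluster_sizes is a hypothetical helper):

```r
# Distribution of cluster sizes in one call.
cluster_sizes <- function(group) as.vector(table(group))
# summary(cluster_sizes(popNCR$class)) condenses the table above:
# sizes range from 16 to 26 pupils, around 20 on average.
```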


Q6. Will there be a small number of clusters? NO, there are 100 clusters


Q7. Will the complete-data model have cross-level interactions? No


Q8. Will the dataset be very large? No, only 2000 cases over 100 clusters


**End of intermezzo**


Replacing norm with pmm


17. Impute the popNCR data once more, this time using predictive mean matching and excluding only pupil as a predictor. Name the resulting object imp4.

imp4 <- mice(popNCR[, -1])
## 
##  iter imp variable
##   1   1  extrav  sex  texp  popular  popteach
##   1   2  extrav  sex  texp  popular  popteach
##   1   3  extrav  sex  texp  popular  popteach
##   1   4  extrav  sex  texp  popular  popteach
##   1   5  extrav  sex  texp  popular  popteach
##   2   1  extrav  sex  texp  popular  popteach
##   2   2  extrav  sex  texp  popular  popteach
##   2   3  extrav  sex  texp  popular  popteach
##   2   4  extrav  sex  texp  popular  popteach
##   2   5  extrav  sex  texp  popular  popteach
##   3   1  extrav  sex  texp  popular  popteach
##   3   2  extrav  sex  texp  popular  popteach
##   3   3  extrav  sex  texp  popular  popteach
##   3   4  extrav  sex  texp  popular  popteach
##   3   5  extrav  sex  texp  popular  popteach
##   4   1  extrav  sex  texp  popular  popteach
##   4   2  extrav  sex  texp  popular  popteach
##   4   3  extrav  sex  texp  popular  popteach
##   4   4  extrav  sex  texp  popular  popteach
##   4   5  extrav  sex  texp  popular  popteach
##   5   1  extrav  sex  texp  popular  popteach
##   5   2  extrav  sex  texp  popular  popteach
##   5   3  extrav  sex  texp  popular  popteach
##   5   4  extrav  sex  texp  popular  popteach
##   5   5  extrav  sex  texp  popular  popteach
## Warning: Number of logged events: 90

Visually inspecting the completed data

18. Plot again the densities of the observed and imputed data with the function densityplot(), but now use imp4. Is there a difference between the imputations obtained with pmm and norm and can you explain this?

densityplot(imp4)

Yes, pmm samples from the observed values and this clearly shows: imputations follow the shape of the observed data.


Comparing the ICC’s

19. Compare the ICC’s of the variables in the first imputed dataset from imp4 with those of imp1, imp2 and the incomplete dataset (use popular, popteach and texp).

See Exercise 12 for the solution.


20. Finally, compare the ICC’s of the imputations to the ICC’s in the original data. The original data can be found in dataset popular. What do you conclude?

data.frame(vars      = names(popNCR[c(6, 7, 5)]), 
           observed  = c(icc(aov(popular ~ class, popNCR)), 
                         icc(aov(popteach ~ class, popNCR)), 
                         icc(aov(texp ~ class, popNCR))), 
           norm      = c(icc(aov(popular ~ class, complete(imp1))), 
                         icc(aov(popteach ~ class, complete(imp1))), 
                         icc(aov(texp ~ class, complete(imp1)))), 
           normclass = c(icc(aov(popular ~ class, complete(imp2))), 
                         icc(aov(popteach ~ class, complete(imp2))), 
                         icc(aov(texp ~ class, complete(imp2)))), 
           pmm       = c(icc(aov(popular ~ class, complete(imp4))), 
                         icc(aov(popteach ~ class, complete(imp4))), 
                         icc(aov(texp ~ class, complete(imp4)))), 
           orig      = c(icc(aov(popular ~ as.factor(class), popular)), 
                         icc(aov(popteach ~ as.factor(class), popular)), 
                         icc(aov(texp ~ as.factor(class), popular))))
##       vars  observed      norm normclass       pmm      orig
## 1  popular 0.3280070 0.2785369 0.3670845 0.3701673 0.3629933
## 2 popteach 0.3138658 0.2542062 0.3347250 0.3433643 0.3414766
## 3     texp 1.0000000 0.4324512 1.0000000 1.0000000 1.0000000

Note: these display only the first imputed data set.
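For reference, the icc() helper from the workspace follows multilevel::ICC1(), which derives the ICC from the one-way ANOVA mean squares. A self-contained base-R sketch on simulated balanced data (the simulated variables are for illustration only):

```r
# One-way ANOVA ICC (ICC1): share of variance at the group level
set.seed(123)
k     <- 20                                         # pupils per class (balanced)
class <- rep(1:10, each = k)
y     <- rnorm(10, sd = 1)[class] + rnorm(10 * k)   # class effect + noise

ms   <- summary(aov(y ~ as.factor(class)))[[1]][["Mean Sq"]]
icc1 <- (ms[1] - ms[2]) / (ms[1] + (k - 1) * ms[2])
icc1  # sample estimate of the population ICC, here 1 / (1 + 1) = 0.5
```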


True multilevel imputation


The mice package includes several methods for imputing multilevel data:

  • 2l.norm: Imputes univariate missing data using a two-level normal model with heterogeneous within group variances
  • 2l.pan: Imputes univariate missing data using a two-level normal model with homogeneous within group variances
  • 2l.lmer: Imputes univariate systematically and sporadically missing data using a two-level normal model using lme4::lmer()
  • 2l.bin: Imputes univariate systematically and sporadically missing data using a two-level logistic model using lme4::glmer()
  • 2lonly.mean: Imputes the mean of the class within the class
  • 2lonly.norm: Imputes univariate missing data at level 2 using Bayesian linear regression analysis
  • 2lonly.pmm: Imputes univariate missing data at level 2 using predictive mean matching

The latter two methods aggregate level 1 variables at level 2, but in combination with mice.impute.2l.pan, allow switching regression imputation between level 1 and level 2 as described in Yucel (2008) or Gelman and Hill (2006, p. 541). For more information on these imputation methods see the help.

There are also many useful imputation routines available from packages miceadds and micemd. See Tables 7.2, 7.3 and 7.4 in Van Buuren (2018) for a detailed overview of these functions.


Imputing the data

21. Impute the variable popular by means of 2l.norm. Use dataset popNCR2.

ini <- mice(popNCR2, maxit = 0)
pred <- ini$pred
pred["popular", ] <- c(0, -2, 2, 2, 2, 0, 2)

In the predictor matrix, -2 denotes the class variable, a value of 1 indicates a fixed effect and a value of 2 indicates a random effect. However, the currently implemented algorithm does not handle predictors specified as fixed effects (type = 1). When using mice.impute.2l.norm(), the current advice is to specify all predictors as random effects (type = 2).
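The positional c(...) vector above is easy to get wrong. The same row can also be written with named indexing, as in this self-contained sketch (which rebuilds an empty predictor matrix with the column order of the popularity data):

```r
# Rebuild an empty predictor matrix with the popularity column order
vars <- c("pupil", "class", "extrav", "sex", "texp", "popular", "popteach")
pred <- matrix(0, 7, 7, dimnames = list(vars, vars))

pred["popular", "class"] <- -2                                 # class variable
pred["popular", c("extrav", "sex", "texp", "popteach")] <- 2   # random effects

pred["popular", ]  # identical to c(0, -2, 2, 2, 2, 0, 2)
```

Named indexing keeps the specification readable and robust against changes in the column order.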

meth <- ini$meth
meth <- c("", "", "", "", "", "2l.norm", "")
imp5 <- mice(popNCR2, pred = pred, meth = meth, print = FALSE)

Inspecting the imputations

22. Inspect the imputations. Did the algorithm converge?

densityplot(imp5, ~popular, ylim = c(0, 0.35), xlim = c(-1.5, 10))

densityplot(imp4, ~popular, ylim = c(0, 0.35), xlim = c(-1.5, 10))

The imputations generated with 2l.norm are very similar to the ones obtained by pmm with class as a fixed effect. If we plot the first imputed dataset from imp4 and imp5 against the original (true) data:

plot(density(popular$popular))  #true data 
lines(density(complete(imp5)$popular), col = "red", lwd = 2)  #2l.norm
lines(density(complete(imp4)$popular), col = "green", lwd = 2)  #PMM

We can see that the imputations are very similar. When studying the convergence

plot(imp5)

we see that convergence is not apparent from this plot; it may be wise to run additional iterations.

imp5.b <- mice.mids(imp5, maxit = 10, print = FALSE)
plot(imp5.b)

After running another 10 iterations, convergence is more convincing.


Homogeneous group variances

23. In the original data, the group variances for popular are homogeneous. Use 2l.pan to impute the variable popular in dataset popNCR2. Inspect the imputations. Did the algorithm converge?

ini <- mice(popNCR2, maxit = 0)
pred <- ini$pred
pred["popular", ] <- c(0, -2, 2, 2, 1, 0, 2)
meth <- ini$meth
meth <- c("", "", "", "", "", "2l.pan", "")
imp6 <- mice(popNCR2, pred = pred, meth = meth, print = FALSE)

Let us create the densityplot for imp6

densityplot(imp6, ~popular, ylim = c(0, 0.35), xlim = c(-1.5, 10))

and compare it to the one for imp4

densityplot(imp4, ~popular, ylim = c(0, 0.35), xlim = c(-1.5, 10))

If we plot the first imputed dataset from both objects against the original (true) density, we obtain the following plot:

plot(density(popular$popular), main = "black = truth | green = PMM | red = 2l.pan")  #true data
lines(density(complete(imp6)$popular), col = "red", lwd = 2)  #2l.pan
lines(density(complete(imp4)$popular), col = "green", lwd = 2)  #PMM

We can see that the imputations are very similar. When studying the convergence

plot(imp6)

we see that convergence is not apparent from this plot; it may be wise to run additional iterations.

imp6.b <- mice.mids(imp6, maxit = 10, print = FALSE)
plot(imp6.b)

Again, after running another 10 iterations, convergence is more convincing.


Multiple imputation of popNCR3


True multilevel imputation

24. Now inspect dataset popNCR3 and impute the incomplete variables according to the following imputation methods:

Variable   Method
extrav     2l.norm
texp       2lonly.mean
sex        2l.bin
popular    2l.pan
popteach   2l.lmer

ini <- mice(popNCR3, maxit = 0)
pred <- ini$pred
pred["extrav", ] <- c(0, -2, 0, 2, 2, 2, 2)  #2l.norm
pred["sex", ] <- c(0, -2, 1, 0, 1, 1, 1)  #2l.bin
pred["texp", ] <- c(0, -2, 1, 1, 0, 1, 1)  #2lonly.mean
pred["popular", ] <- c(0, -2, 2, 2, 1, 0, 2)  #2l.pan
pred["popteach", ] <- c(0, -2, 2, 2, 1, 2, 0)  #2l.lmer
meth <- ini$meth
meth <- c("", "", "2l.norm", "2l.bin", "2lonly.mean", "2l.pan", "2l.lmer")
imp7 <- mice(popNCR3, pred = pred, meth = meth, print = FALSE)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
fit <- with(imp7, lmer(popular ~ 1 + (1|class)))
testEstimates(as.mitml.result(fit), var.comp = TRUE)
## 
## Call:
## 
## testEstimates(model = as.mitml.result(fit), var.comp = TRUE)
## 
## Final parameter estimates and inferences obtained from 5 imputed data sets.
## 
##              Estimate Std.Error   t.value        df   P(>|t|)       RIV       FMI 
## (Intercept)     5.043     0.088    57.591 53859.428     0.000     0.009     0.009 
## 
##                            Estimate 
## Intercept~~Intercept|class    0.699 
## Residual~~Residual            1.218 
## ICC|class                     0.364 
## 
## Unadjusted hypothesis test as appropriate in larger samples.
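Under the hood, testEstimates() combines the m fitted models with Rubin's rules. A self-contained base-R sketch of that pooling step for a single parameter (the input numbers below are made up for illustration):

```r
# Rubin's rules: pool m estimates and standard errors into one inference
pool_rubin <- function(est, se) {
  m    <- length(est)
  qbar <- mean(est)                # pooled point estimate
  ubar <- mean(se^2)               # average within-imputation variance
  b    <- var(est)                 # between-imputation variance
  tv   <- ubar + (1 + 1 / m) * b   # total variance
  c(estimate  = qbar,
    std.error = sqrt(tv),
    riv       = (1 + 1 / m) * b / ubar)  # relative increase in variance
}

# Hypothetical intercepts and standard errors from m = 5 imputations
pool_rubin(est = c(5.04, 5.05, 5.03, 5.06, 5.04), se = rep(0.088, 5))
```

The total variance exceeds the average within-imputation variance whenever the estimates vary across imputations, which is exactly the extra uncertainty due to the missing data.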

Evaluating the imputations

25. Evaluate the imputations by means of convergence, distributions and plausibility.

densityplot(imp7)

stripplot(imp7)

Given what we know about the missingness, the imputed densities look very reasonable.

plot(imp7)

Convergence has not yet been reached; more iterations are advisable.


pmm imputation with class as a predictor

26. Repeat the same imputations as in the previous step, but now use pmm for all non-binary incomplete variables. Include class as a predictor.

pmmdata <- popNCR3
pmmdata$class <- as.factor(popNCR3$class)
imp8 <- mice(pmmdata, m = 5, print = FALSE)
## Warning: Number of logged events: 90

A warning is printed that there are 90 logged events. When we inspect

imp8$loggedEvents
##    it im      dep   meth  out
## 1   1  1  popular    pmm texp
## 2   1  1 popteach    pmm texp
## 3   1  2  popular    pmm texp
## 4   1  2 popteach    pmm texp
## 5   1  3  popular    pmm texp
## 6   1  3 popteach    pmm texp
## 7   1  4  popular    pmm texp
## 8   1  4 popteach    pmm texp
## 9   1  5  popular    pmm texp
## 10  1  5 popteach    pmm texp
## 11  2  1   extrav    pmm texp
## 12  2  1      sex logreg texp
## 13  2  1  popular    pmm texp
## 14  2  1 popteach    pmm texp
## 15  2  2   extrav    pmm texp
## 16  2  2      sex logreg texp
## 17  2  2  popular    pmm texp
## 18  2  2 popteach    pmm texp
## 19  2  3   extrav    pmm texp
## 20  2  3      sex logreg texp
## 21  2  3  popular    pmm texp
## 22  2  3 popteach    pmm texp
## 23  2  4   extrav    pmm texp
## 24  2  4      sex logreg texp
## 25  2  4  popular    pmm texp
## 26  2  4 popteach    pmm texp
## 27  2  5   extrav    pmm texp
## 28  2  5      sex logreg texp
## 29  2  5  popular    pmm texp
## 30  2  5 popteach    pmm texp
## 31  3  1   extrav    pmm texp
## 32  3  1      sex logreg texp
## 33  3  1  popular    pmm texp
## 34  3  1 popteach    pmm texp
## 35  3  2   extrav    pmm texp
## 36  3  2      sex logreg texp
## 37  3  2  popular    pmm texp
## 38  3  2 popteach    pmm texp
## 39  3  3   extrav    pmm texp
## 40  3  3      sex logreg texp
## 41  3  3  popular    pmm texp
## 42  3  3 popteach    pmm texp
## 43  3  4   extrav    pmm texp
## 44  3  4      sex logreg texp
## 45  3  4  popular    pmm texp
## 46  3  4 popteach    pmm texp
## 47  3  5   extrav    pmm texp
## 48  3  5      sex logreg texp
## 49  3  5  popular    pmm texp
## 50  3  5 popteach    pmm texp
## 51  4  1   extrav    pmm texp
## 52  4  1      sex logreg texp
## 53  4  1  popular    pmm texp
## 54  4  1 popteach    pmm texp
## 55  4  2   extrav    pmm texp
## 56  4  2      sex logreg texp
## 57  4  2  popular    pmm texp
## 58  4  2 popteach    pmm texp
## 59  4  3   extrav    pmm texp
## 60  4  3      sex logreg texp
## 61  4  3  popular    pmm texp
## 62  4  3 popteach    pmm texp
## 63  4  4   extrav    pmm texp
## 64  4  4      sex logreg texp
## 65  4  4  popular    pmm texp
## 66  4  4 popteach    pmm texp
## 67  4  5   extrav    pmm texp
## 68  4  5      sex logreg texp
## 69  4  5  popular    pmm texp
## 70  4  5 popteach    pmm texp
## 71  5  1   extrav    pmm texp
## 72  5  1      sex logreg texp
## 73  5  1  popular    pmm texp
## 74  5  1 popteach    pmm texp
## 75  5  2   extrav    pmm texp
## 76  5  2      sex logreg texp
## 77  5  2  popular    pmm texp
## 78  5  2 popteach    pmm texp
## 79  5  3   extrav    pmm texp
## 80  5  3      sex logreg texp
## 81  5  3  popular    pmm texp
## 82  5  3 popteach    pmm texp
## 83  5  4   extrav    pmm texp
## 84  5  4      sex logreg texp
## 85  5  4  popular    pmm texp
## 86  5  4 popteach    pmm texp
## 87  5  5   extrav    pmm texp
## 88  5  5      sex logreg texp
## 89  5  5  popular    pmm texp
## 90  5  5 popteach    pmm texp

We find that texp has been excluded as a predictor in 90 instances. This is to be expected: when class is added as a factor (categorical variable) to the model, a separate intercept is effectively fitted for each class. Within each class, observed texp is constant and, hence, it is automatically removed by mice to avoid estimation problems due to the redundant parameter.
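The mechanism is ordinary collinearity: a level-2 variable such as texp is constant within class, so it is a linear combination of the class dummies. A self-contained sketch with toy numbers:

```r
# A within-class-constant predictor is redundant once class enters as a factor
class <- rep(1:3, each = 4)
texp  <- rep(c(10, 5, 20), each = 4)       # constant within each class

X <- model.matrix(~ as.factor(class) + texp)
qr(X)$rank < ncol(X)                       # TRUE: design matrix is rank-deficient
```

mice detects this rank deficiency and logs the removal of texp as an event rather than fitting an unidentified model.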

With pmm, the imputations are very similar and conform to the shape of the observed data.

densityplot(imp8)

stripplot(imp8)

When looking at the convergence of pmm, more iterations are advisable:

plot(imp8)

fit <- with(imp8, lmer(popular ~ 1 + (1|class)))
testEstimates(as.mitml.result(fit), var.comp = TRUE)
## 
## Call:
## 
## testEstimates(model = as.mitml.result(fit), var.comp = TRUE)
## 
## Final parameter estimates and inferences obtained from 5 imputed data sets.
## 
##              Estimate Std.Error   t.value        df   P(>|t|)       RIV       FMI 
## (Intercept)     5.047     0.088    57.136 21545.583     0.000     0.014     0.014 
## 
##                            Estimate 
## Intercept~~Intercept|class    0.710 
## Residual~~Residual            1.188 
## ICC|class                     0.374 
## 
## Unadjusted hypothesis test as appropriate in larger samples.

If the primary interest is in the fixed effects, using pmm in a fixed-effects setup, with class added as a cluster dummy variable, may be an easily implemented alternative to running a full multilevel model; see e.g. Vink, Lazendic & Van Buuren (2015). In our example, when compared to the true data set, the parameters for the model popular ~ 1 + (1|class) are unbiased. However, some authors have reported that the bias in random slopes and variance components can be substantial. If the focus lies on the random components in the model, a properly fitted parametric full multilevel imputation would be more efficient. See Drechsler (2015), Lüdtke, Robitzsch, and Grund (2017) and Speidel, Drechsler, and Sakshaug (2017) for more detail on using cluster dummies in multilevel imputation and estimation.


Conclusions

There are ways to ensure that imputations are not just ‘guesses of unobserved values’. Imputations can be checked by using a standard of reasonability - see Abayomi, Gelman, and Levy (2008) for a wonderful introduction to diagnostics for multiple imputations. We are able to check the differences between observed and imputed values, the differences between their distributions as well as the distribution of the completed data as a whole. If we do this, we can see whether imputations make sense in the context of the problem being studied.


References

Abayomi, K., Gelman, A., & Levy, M. (2008). Diagnostics for multivariate imputations. Journal of the Royal Statistical Society: Series C (Applied Statistics), 57, 273-291. Article

Drechsler, J. (2015). Multiple Imputation of Multilevel Missing Data: Rigor Versus Simplicity. Journal of Educational and Behavioral Statistics 40 (1): 69–95. Article

Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

Hox, J. J., Moerbeek, M., & van de Schoot, R. (2010). Multilevel analysis: Techniques and applications. Routledge.

Lüdtke, O., Robitzsch, A., & Grund, S. (2017). Multiple imputation of missing data in multilevel designs: A comparison of different strategies. Psychological Methods, 22(1), 141-165. Article

Speidel, M., J. Drechsler, and J. W. Sakshaug. (2017). Biases in Multilevel Analyses Caused by Cluster-Specific Fixed-Effects Imputation. Behavior Research Methods, 1–17. Article

Van Buuren, S. (2018). Flexible Imputation of Missing Data. Chapman & Hall/CRC. online book

Vink, G., Lazendic, G., & Van Buuren, S. (2015). Psychological Test and Assessment Modeling, 57(4), 577-594. Article

Yucel, R. M. (2008). Multiple imputation inference for multivariate multilevel continuous data with ignorable non-response. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 366(1874), 2389-2403. Article


- End of practical